IRWIN AND JOAN JACOBS CENTER FOR COMMUNICATION AND INFORMATION TECHNOLOGIES Quality Preserving Compression of a Concatenative Text-To-Speech Acoustic Database
Abstract
A Concatenative Text-To-Speech (CTTS) synthesizer requires a large acoustic database for high-quality speech synthesis. This database consists of many acoustic leaves, each containing a number of short, compressed speech segments. In this paper we propose two algorithms for re-compressing the acoustic database, leaf by leaf, without compromising the perceptual quality of the synthesized speech. This is achieved by exploiting the redundancy between speech frames and between speech segments within each acoustic leaf. The first approach is based on a vector polynomial Temporal Decomposition. The second is based on a 3D Shape-Adaptive DCT followed by optimized quantization. In addition, we propose a segment ordering algorithm in an attempt to improve overall performance. The developed algorithms are generic and may be applied to a variety of compression challenges. When applied to the compressed spectral amplitude parameters of a specific IBM small-footprint CTTS database, they yield a re-compression factor of 2 without any perceived degradation in the quality of the synthesized speech.
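The abstract gives no implementation details, so the following is only a minimal sketch of one idea it names: a transform whose length adapts to each segment, followed by quantization. It is not the authors' 3D Shape-Adaptive DCT or their optimized quantizer. The toy `leaf` data, the uniform quantizer, and the `step` value are all assumptions made for illustration.

```python
import math

def dct_ii(x):
    """Orthonormal DCT-II of a sequence of any length N >= 1."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n)
                               for i in range(n)))
    return out

def idct_ii(c):
    """Inverse of dct_ii (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = 0.0
        for k in range(n):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            s += scale * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(s)
    return out

def quantize(coeffs, step):
    """Uniform scalar quantizer (assumption: the paper's quantizer is optimized)."""
    return [round(c / step) for c in coeffs]

def dequantize(levels, step):
    return [q * step for q in levels]

# Hypothetical toy "leaf": one parameter track per segment, with unequal lengths.
leaf = [
    [1.0, 1.1, 1.2, 1.1],
    [0.9, 1.0, 1.1, 1.2, 1.2, 1.1],
    [1.0, 1.0],
]
step = 0.05  # hypothetical quantization step

decoded = []
for track in leaf:
    # The transform length follows the segment length, which is the core
    # idea behind shape-adaptive transforms.
    levels = quantize(dct_ii(track), step)
    decoded.append(idct_ii(dequantize(levels, step)))
```

Because the transform is orthonormal, the squared reconstruction error of each track is bounded by the number of samples times `(step / 2) ** 2`, which gives a direct handle on the rate and distortion trade-off in this simplified setting.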
Related resources
A Hybrid Text-to-Speech System that Combines Concatenative and Statistical Synthesis Units
Concatenative synthesis and statistical synthesis are the two main approaches to text-to-speech (TTS) synthesis. Concatenative TTS (CTTS) stores natural speech features segments, selected from a recorded speech database. Consequently, CTTS systems enable speech synthesis with natural quality. However, as the footprint of the stored data is reduced, desired segments are not always available in t...
Erasure/List Exponents for Slepian-Wolf Decoding
Universal Decoding for Gaussian Intersymbol Interference Channels
Gaussian beams scattered from different materials
On the Statistical Physics of Directed Polymers in a Random Medium and Their Relation to Tree-Structured Lossy Compression
Using well-known results from statistical physics concerning the almost-sure behavior of the free energy of directed polymers in a random medium, we prove that a certain ensemble of tree-structured rate-distortion codes with delayless decoding asymptotically achieves the rate-distortion function under a certain symmetry condition.